# Multimodal text-image understanding

Gemma 3 12b It Qat GGUF

Gemma is a lightweight, advanced open model series from Google, built using the technology behind the Gemini models. Gemma 3 is a multimodal model capable of processing both text and image inputs to generate text outputs.

Gemma 3 4b It Qat Q4 0 Unquantized

Gemma 3 is a lightweight open multimodal model from Google, built on the same technology as Gemini. It accepts text and image inputs and generates text outputs.

© 2025 AIbase